An Automatic Segmentation Algorithm Based on Chinese Phoneme
Authors
Abstract
The primary task of Chinese language processing is to establish an efficient and accurate segmentation strategy. Exploiting the ideo-phonetic character of Chinese, this paper proposes an automatic segmentation algorithm based on Chinese phonemes to achieve disambiguation. First, the candidate tag set, which consists of ambiguous phrases that result from Chinese polyphones, is built, and every possible segmentation result of each phrase composes the segmentation tag set. Then the calculation of the posterior probability is transformed into an optimization problem, and a genetic algorithm is used to obtain the optimal solution. The approach also resolves the sparse data problem in HMMs. Experiments show that using this method to resolve the ambiguity caused by polyphones is practicable and effective.
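The abstract gives no formulas or parameters, so the following is only a minimal sketch of the general idea: treat the choice of one candidate per ambiguous phrase as a chromosome, score it with an HMM-style log posterior, and search with a simple genetic algorithm. The tables `EMIT` and `TRANS`, the function names, and all numeric values are invented for illustration and are not taken from the paper; the selection, crossover, and mutation operators are generic textbook choices, not necessarily the authors'.

```python
import math
import random

# Invented toy model (assumption, not from the paper):
# EMIT[i][t]  ~ probability of candidate t for ambiguous phrase i
# TRANS[a][b] ~ probability that candidate b follows candidate a
EMIT = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.4, 0.4, 0.2],
]
TRANS = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.3, 0.3, 0.4],
]


def fitness(chrom):
    """Log of the unnormalised posterior of one candidate assignment."""
    score = sum(math.log(EMIT[i][t]) for i, t in enumerate(chrom))
    score += sum(math.log(TRANS[a][b]) for a, b in zip(chrom, chrom[1:]))
    return score


def genetic_search(pop_size=20, generations=40, mutation_rate=0.1, seed=0):
    """Search for a high-scoring assignment with a basic genetic algorithm."""
    rng = random.Random(seed)
    n, k = len(EMIT), len(TRANS)
    pop = [[rng.randrange(k) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]   # truncation selection
        children = [pop[0][:]]           # elitism: keep the current best
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n)    # one-point crossover
            child = p1[:cut] + p2[cut:]
            for i in range(n):           # per-gene mutation
                if rng.random() < mutation_rate:
                    child[i] = rng.randrange(k)
            children.append(child)
        pop = children
    return max(pop, key=fitness)


best = genetic_search()
print(best, round(fitness(best), 3))
```

A genetic algorithm is a heuristic, so the returned assignment is not guaranteed to be the global optimum; on a toy problem of this size it can be checked against exhaustive enumeration.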
Similar resources
Automatic Prostate Cancer Segmentation Using Kinetic Analysis in Dynamic Contrast-Enhanced MRI
Background: Dynamic contrast-enhanced magnetic resonance imaging (DCE-MRI) provides functional information on the microcirculation in tissues by analyzing the enhancement kinetics, which can be used as biomarkers for prostate lesion detection and characterization. Objective: The purpose of this study is to investigate spatiotemporal patterns of tumors by extracting semi-quantitative as well as w...
Towards A Phoneme Labeled Mandarin Chinese Speech Corpus
Phoneme level transcription of speech corpora is crucial to fundamental speech research and the increasingly interested detection-based automatic speech recognition. Currently, there is no existing phoneme-labeled Mandarin Chinese speech corpus. This paper presents our recent work towards development of such a corpus. Our goal is to label five hours of speech data selected from a Mandarin Chine...
Improved HMM/SVM methods for automatic phoneme segmentation
This paper presents improved HMM/SVM methods for a two-stage phoneme segmentation framework, which tries to imitate the human phoneme segmentation process. The first stage performs hidden Markov model (HMM) forced alignment according to the minimum boundary error (MBE) criterion. The objective is to align a phoneme sequence of a speech utterance with its acoustic signal counterpart based on MBE-...
Additional use of phoneme duration hypotheses in automatic speech segmentation
In this paper, we describe a new approach for speaker independent automatic phoneme alignment. Typical algorithms for this task use only phoneme-to-frame similarity measures which are somehow maximised or minimised. In addition to such similarity measures, we use phoneme duration hypotheses generated by the speech synthesis system HADIFIX [1]. For algorithms based on dynamic programming, it is ...
A constrained Baum-Welch algorithm for improved phoneme segmentation and efficient training
We describe an extension to the Baum-Welch algorithm for training Hidden Markov Models that uses explicit phoneme segmentation to constrain the forward and backward lattice. The HMMs trained with this algorithm can be shown to improve the accuracy of automatic phoneme segmentation. In addition, this algorithm is significantly more computationally efficient than the full Baum-Welch algorithm, whi...